Conversation
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #17732 +/- ##
============================================
+ Coverage 40.55% 40.56% +0.01%
- Complexity 2574 2575 +1
============================================
Files 5179 5179
Lines 349896 349918 +22
Branches 44727 44731 +4
============================================
+ Hits 141890 141954 +64
+ Misses 208006 207964 -42 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
luoluoyuyu
left a comment
There was a problem hiding this comment.
Review summary
Reference counting for shared JARs fixes incorrect removal of existedJarToMD5 when only one of several UDFs is dropped. rebuildJarMetadataFromUDFTable() keeps jar metadata consistent after snapshot load. New unit tests cover the shared-jar and snapshot paths.
Two small suggestions below; otherwise this looks good to merge.
| } | ||
|
|
||
| private void addJarReference(String jarName, String jarMD5) { | ||
| existedJarToMD5.putIfAbsent(jarName, jarMD5); |
There was a problem hiding this comment.
putIfAbsent keeps the first MD5 for a jar name but always increments the reference count. That is fine when validate() runs before addUDFInTable, but if addJarReference is ever called without that check, ref count and MD5 could diverge.
Consider rejecting a conflicting MD5 inside addJarReference (same check as in validate() at lines 107-115) so this helper is safe on its own.
| deserializeExistedJarToMD5(fileInputStream); | ||
|
|
||
| udfTable.deserializeUDFTable(fileInputStream); | ||
| rebuildJarMetadataFromUDFTable(); |
There was a problem hiding this comment.
After deserializeExistedJarToMD5, rebuildJarMetadataFromUDFTable() clears both maps and rebuilds from udfTable. That makes udfTable the source of truth on load.
A short comment here explaining that deserialized existedJarToMD5 is intentionally discarded would help future readers.



Description
This PR fixes UDF jar metadata handling in
UDFInfowhen multiple UDFs share the same jar.Background
UDFInfopreviously tracked uploaded jars only withjarName -> md5. When a UDF was dropped, the jar metadata wasremoved immediately. This breaks the case where multiple UDFs reference the same jar:
reappear after restart
Changes
This PR introduces reference counting for shared UDF jars in
UDFInfo.existedJarToReferenceCountto track how many UDFs are using each jarThis keeps shared jar metadata consistent across normal create/drop operations and snapshot recovery.
Tests
Updated
UDFInfoTestto cover the shared-jar cases:This PR has:
for an unfamiliar reader.
for code coverage.
Key changed/added classes (or packages if there are too many classes) in this PR